Geometric Rectification of Camera-Captured Document Images

Internal identifier: 000D16 (Main/Exploration); previous: 000D15; next: 000D17

Authors: Jian Liang [United States]; Daniel Dementhon [United States]; David Doermann [United States]

Source:

IEEE Transactions on Pattern Analysis and Machine Intelligence [ISSN 0162-8828]; 2008.

RBID: Pascal:08-0175226

French descriptors

Pascal (Inist)
- Intelligence artificielle, Analyse forme, Reconnaissance optique caractère, Reconnaissance caractère, Image tridimensionnelle, Texture, Traitement flux donnée, Caméra vidéo, Courbure, Analyse documentaire, Analyse texture, Rectification, Scanneur, Projection perspective, Frontal, Métrique, Etalonnage.
Wicri:
- topic: Intelligence artificielle.

English descriptors

KwdEn:
- Artificial intelligence, Calibration, Character recognition, Curvature, Data flow processing, Document analysis, Frontal, Metric, Optical character recognition, Pattern analysis, Perspective projection, Rectification, Scanner, Texture, Texture analysis, Tridimensional image, Video cameras.

Abstract

Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and noncontact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by a nonplanar document shape and perspective projection, which lead to the failure of current optical character recognition (OCR) technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates the 3D document shape from texture flow information obtained directly from the image without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many, especially mobile, camera-based document analysis applications. Experiments show that our method produces results that are significantly more OCR compatible than the original images.

Links to previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000287
to stream PascalFrancis, to step Curation: 000497
to stream PascalFrancis, to step Checkpoint: 000238
to stream Main, to step Merge: 000D28
to stream Main, to step Curation: 000D16

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Geometric Rectification of Camera-Captured Document Images</title>
<author><name sortKey="Jian Liang" sort="Jian Liang" uniqKey="Jian Liang" last="Jian Liang">JIAN LIANG</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Amazon.com, 701 5th Ave. #614.B</s1>
<s2>Seattle, WA 98104</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
<affiliation wicri:level="4"><inist:fA14 i1="02"><s1>Institute for Advanced Computer Studies, University of Maryland, 3449 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<affiliation wicri:level="4"><inist:fA14 i1="03"><s1>Laboratory for Language and Media Processing, Institute for Advanced Computer Studies, University of Maryland, 3451 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
<placeName><settlement type="city">College Park (Maryland)</settlement>
<region type="state">Maryland</region>
</placeName>
<orgName type="university" n="3">Université du Maryland</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">08-0175226</idno>
<date when="2008">2008</date>
<idno type="stanalyst">PASCAL 08-0175226 INIST</idno>
<idno type="RBID">Pascal:08-0175226</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000287</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000497</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000238</idno>
<idno type="wicri:doubleKey">0162-8828:2008:Jian Liang:geometric:rectification:of</idno>
<idno type="wicri:Area/Main/Merge">000D28</idno>
<idno type="wicri:Area/Main/Curation">000D16</idno>
<idno type="wicri:Area/Main/Exploration">000D16</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Geometric Rectification of Camera-Captured Document Images</title>
<author><name sortKey="Jian Liang" sort="Jian Liang" uniqKey="Jian Liang" last="Jian Liang">JIAN LIANG</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Amazon.com, 701 5th Ave. #614.B</s1>
<s2>Seattle, WA 98104</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
<affiliation wicri:level="4"><inist:fA14 i1="02"><s1>Institute for Advanced Computer Studies, University of Maryland, 3449 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<affiliation wicri:level="4"><inist:fA14 i1="03"><s1>Laboratory for Language and Media Processing, Institute for Advanced Computer Studies, University of Maryland, 3451 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
<placeName><settlement type="city">College Park (Maryland)</settlement>
<region type="state">Maryland</region>
</placeName>
<orgName type="university" n="3">Université du Maryland</orgName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">IEEE transactions on pattern analysis and machine intelligence</title>
<title level="j" type="abbreviated">IEEE trans. pattern anal. mach. intell.</title>
<idno type="ISSN">0162-8828</idno>
<imprint><date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">IEEE transactions on pattern analysis and machine intelligence</title>
<title level="j" type="abbreviated">IEEE trans. pattern anal. mach. intell.</title>
<idno type="ISSN">0162-8828</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Artificial intelligence</term>
<term>Calibration</term>
<term>Character recognition</term>
<term>Curvature</term>
<term>Data flow processing</term>
<term>Document analysis</term>
<term>Frontal</term>
<term>Metric</term>
<term>Optical character recognition</term>
<term>Pattern analysis</term>
<term>Perspective projection</term>
<term>Rectification</term>
<term>Scanner</term>
<term>Texture</term>
<term>Texture analysis</term>
<term>Tridimensional image</term>
<term>Video cameras</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Intelligence artificielle</term>
<term>Analyse forme</term>
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance caractère</term>
<term>Image tridimensionnelle</term>
<term>Texture</term>
<term>Traitement flux donnée</term>
<term>Caméra vidéo</term>
<term>Courbure</term>
<term>Analyse documentaire</term>
<term>Analyse texture</term>
<term>Rectification</term>
<term>Scanneur</term>
<term>Projection perspective</term>
<term>Frontal</term>
<term>Métrique</term>
<term>Etalonnage</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Intelligence artificielle</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and noncontact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by a nonplanar document shape and perspective projection, which lead to the failure of current optical character recognition (OCR) technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates the 3D document shape from texture flow information obtained directly from the image without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many, especially mobile, camera-based document analysis applications. Experiments show that our method produces results that are significantly more OCR compatible than the original images.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Maryland</li>
<li>Washington (État)</li>
</region>
<settlement><li>College Park (Maryland)</li>
</settlement>
<orgName><li>Université du Maryland</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="Washington (État)"><name sortKey="Jian Liang" sort="Jian Liang" uniqKey="Jian Liang" last="Jian Liang">JIAN LIANG</name>
</region>
<name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
<name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000D16 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000D16 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:08-0175226
   |texte=   Geometric Rectification of Camera-Captured Document Images
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

Exploration server on OCR
Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
